Skip to content

Fix naming bug that would incorrectly discard kernels as duplicates#8745

Open
AlexBrownAMD wants to merge 7 commits into
developfrom
users/alexbrownamd/WGMHang
Open

Fix naming bug that would incorrectly discard kernels as duplicates#8745
AlexBrownAMD wants to merge 7 commits into
developfrom
users/alexbrownamd/WGMHang

Conversation

@AlexBrownAMD

Copy link
Copy Markdown
Contributor

Motivation

Fixes a bug that could cause kernels to hang or produce invalid results.

Technical Details

WGMXCC value was ignored in the kernel name (flagged as an internal arg). But, WGMXCC=-1 generates different assembly code from WGMXCC set to a fixed value. One version generates code to handle chunking, the other is regular xcc mapping.

If WGMXCC is left out of the name, kernels with different values are discarded as duplicates, even though they contain different assembly code. This can lead to errors like kernel hangs or validation errors if a kernel with regular xcc mapping is called with arguments meant for the chunking algorithm.

This change includes WGMXCC in kernel name as either WGMXCCn1 for -1 (chunk) or WGMXCC1 for any other fixed value since they would produce the same kernel code.

Test Plan

Tested locally and it fixes the kernel hang that was discovered. Running CI to verify other cases.

@nakajee

nakajee commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Is this the right approach?
Shouldn't we make asm code independent from WorkGroupMappingXCC?

@codecov-commenter

codecov-commenter commented Jun 24, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

❌ Your project status has failed because the head coverage (76.92%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #8745   +/-   ##
========================================
  Coverage    71.34%   71.34%           
========================================
  Files         2613     2613           
  Lines       408997   408997           
  Branches     61114    61114           
========================================
  Hits        291773   291773           
  Misses       95824    95824           
  Partials     21400    21400           
Flag Coverage Δ *Carryforward flag
TensileLite 76.98% <ø> (ø) Carriedforward from fb8fc30
hipBLAS 90.81% <ø> (ø) Carriedforward from fb8fc30
hipBLASLt 41.36% <ø> (ø)
hipCUB 82.68% <ø> (ø) Carriedforward from fb8fc30
hipDNN 86.79% <ø> (ø) Carriedforward from fb8fc30
hipFFT 50.17% <ø> (ø) Carriedforward from fb8fc30
hipRAND 76.12% <ø> (ø) Carriedforward from fb8fc30
hipSOLVER 69.18% <ø> (ø) Carriedforward from fb8fc30
hipSPARSE 86.55% <ø> (ø) Carriedforward from fb8fc30
rocBLAS 48.08% <ø> (ø) Carriedforward from fb8fc30
rocFFT 46.30% <ø> (ø) Carriedforward from fb8fc30
rocRAND 57.07% <ø> (ø) Carriedforward from fb8fc30
rocSOLVER 76.92% <ø> (ø) Carriedforward from fb8fc30
rocSPARSE 72.37% <ø> (ø) Carriedforward from fb8fc30
rocThrust 91.36% <ø> (ø) Carriedforward from fb8fc30

*This pull request uses carry forward flags. Click here to find out more.

Files with missing lines Coverage Δ
...t/tensilelite/Tensile/Common/RequiredParameters.py 100.00% <ø> (ø)
...aslt/tensilelite/Tensile/SolutionStructs/Naming.py 98.33% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@AlexBrownAMD

Copy link
Copy Markdown
Contributor Author

Is this the right approach? Shouldn't we make asm code independent from WorkGroupMappingXCC?

There are 2 xcc mapping algorithms: regular xcc mapping and the new chunking algorithm. For now I think both are needed since we have many kernels tuned using both.

@nakajee nakajee left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@nakajee

nakajee commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Is this the right approach? Shouldn't we make asm code independent from WorkGroupMappingXCC?

There are 2 xcc mapping algorithms: regular xcc mapping and the new chunking algorithm. For now I think both are needed since we have many kernels tuned using both.

OK. We can do in that way for now.
Ideally, I would like to remove support for StreamK + WorkGroupMappingXCC!=-1.

@nakajee

nakajee commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Please make sure CI fail is related to your change or not.

@nakajee

nakajee commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Is this the right approach? Shouldn't we make asm code independent from WorkGroupMappingXCC?

There are 2 xcc mapping algorithms: regular xcc mapping and the new chunking algorithm. For now I think both are needed since we have many kernels tuned using both.

OK. We can do in that way for now. Ideally, I would like to remove support for StreamK + WorkGroupMappingXCC!=-1.

This is because it looks odd WorkGroupMappingXCC is passed as a kernel argument, but it still changes kernel code.
So, ideal situation will be asm kernel is independent from WorkGroupMappingXCC.

@therock-pr-bot

Copy link
Copy Markdown

❌ PR Check — Action Required

Check Status Details
🌿 Branch Name ✅ Pass
📝 PR Title/Description ❌ Fail Error: Title does not follow Conventional Commits style.
Expected: start with a valid type (feat, fix, docs, …).
Desired format: type(optional-scope): short description
───
Error: PR description must reference a JIRA ID, ISSUE ID, or a GitHub closing keyword.
Expected: include a JIRA ID / ISSUE ID line (separator : or -, or omitted; value may be a JIRA key, a number with/without #, or a link), OR a closing keyword + issue reference. Accepted examples:
JIRA ID : TESTAUTO-6039
JIRA ID - #330
JIRA ID #330
ISSUE ID : TESTUTO-3334
ISSUE ID #3334
ISSUE ID - TESTAUTO-3433
ISSUE ID : https://github.com/<org_name>/<repo_name>/issues/1234
Closes #10
Fixes octo-org/octo-repo#100
Resolves: #123
#123
https://github.com/<org_name>/<repo_name>/issues/123
Current: no valid JIRA/ISSUE/closing-keyword reference found
Forbidden Files ✅ Pass
🧪 Unit Test ✅ Pass
🔎 pre-commit ✅ Pass
🚫 Draft PR 🔜 To Be Enabled
🚩 Feature Flag 🔜 To Be Enabled
📊 Code Coverage 🔜 To Be Enabled

⚠️ 1 policy check(s) failed. Please address the issues above before this PR can be Reviewed.

🚫 Please fix the failed policies

  • ❌ PR Title/Description

The Not ready to Review label was added to this PR. Once all policies pass, the label is removed automatically.

📖 Need help? See the Policy FAQ for details on every check and how to fix failures.

@therock-pr-bot

Copy link
Copy Markdown

🚫 Please fix the failed policies before requesting reviews.

The following policy checks failed:

  • ❌ PR Title/Description

The Not ready to Review label has been added to this PR.
Once all policies pass, the label will be removed automatically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants